(Exponentiated) Stochastic Gradient Descent for L1 Constrained Problems

Author

  • Ambuj Tewari
Abstract

This note is by Sham Kakade, Dean Foster, and Eyal Even-Dar. It is intended as an introductory piece on solving L1 constrained problems with online methods. Convex optimization problems with L1 constraints frequently underlie tasks such as feature selection and obtaining sparse representations. This note shows that the exponentiated gradient algorithm (of Kivinen and Warmuth (1997)), when used as a stochastic gradient descent algorithm, is quite effective as an optimization tool under general convex loss functions, requiring a number of gradient steps that is logarithmic in the number of dimensions under mild assumptions. In particular, for supervised learning problems in which we desire to approximately minimize some general convex loss (including the square, logistic, hinge, or absolute loss) in the presence of many irrelevant features, this algorithm is efficient, with a sample complexity that is only logarithmic in the total number of features and a computational complexity that is only linear in the total number of features (ignoring log factors).
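To make the procedure concrete, below is a minimal sketch of the EG± variant of exponentiated gradient used as stochastic gradient descent on the L1 ball, here with squared loss on synthetic data. The step size, L1 budget s, and all names are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def eg_pm_sgd(X, y, s=1.0, eta=0.1, epochs=5, rng=None):
    """EG+- stochastic gradient descent on the L1 ball {w : ||w||_1 <= s},
    here with squared loss. The weight vector is split as
    w = w_plus - w_minus, and both parts are kept on a scaled simplex,
    so the L1 constraint holds by construction."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    # Uniform initialization: total mass s split over 2d coordinates.
    w_plus = np.full(d, s / (2 * d))
    w_minus = np.full(d, s / (2 * d))
    for _ in range(epochs):
        for i in rng.permutation(n):
            w = w_plus - w_minus
            # Stochastic gradient of the squared loss at one example.
            g = (X[i] @ w - y[i]) * X[i]
            # Multiplicative (exponentiated) updates.
            w_plus *= np.exp(-eta * g)
            w_minus *= np.exp(eta * g)
            # Renormalize so the total mass stays at s.
            z = (w_plus.sum() + w_minus.sum()) / s
            w_plus /= z
            w_minus /= z
    return w_plus - w_minus

# Toy usage: 1000 features, only the first 3 relevant.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1000))
w_true = np.zeros(1000)
w_true[:3] = [0.5, -0.3, 0.2]
y = X @ w_true + 0.01 * rng.standard_normal(200)
w_hat = eg_pm_sgd(X, y, s=1.0, eta=0.05, epochs=10)
```

The multiplicative update is what yields the logarithmic dependence on dimension: mass concentrates exponentially fast on the few relevant coordinates.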


Similar resources

Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty

Stochastic gradient descent (SGD) uses approximate gradients estimated from subsets of the training data and updates the parameters in an online fashion. This learning framework is attractive because it often requires much less training time in practice than batch training algorithms. However, L1-regularization, which is becoming popular in natural language processing because of its ability to ...
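The cumulative-penalty idea can be sketched as follows; this is a loose reconstruction assuming logistic loss with labels in {0, 1}, applying the clipping to all weights rather than only the active features, with all names illustrative:

```python
import numpy as np

def sgd_l1_cumulative(X, y, lam=0.1, eta=0.1, epochs=5, rng=None):
    """SGD with a cumulative L1 penalty: track the total penalty u that
    could have been applied so far and the amount q[i] actually applied
    to each weight, then clip each weight toward zero by the deficit,
    never letting it cross zero. Assumes logistic loss with y in {0, 1}."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    w = np.zeros(d)
    q = np.zeros(d)   # penalty actually applied to each weight so far
    u = 0.0           # penalty that could have been applied so far
    for _ in range(epochs):
        for i in rng.permutation(n):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= eta * (p - y[i]) * X[i]          # plain gradient step
            u += eta * lam                        # accumulate the penalty
            z = w.copy()
            pos, neg = w > 0, w < 0
            w[pos] = np.maximum(0.0, w[pos] - (u + q[pos]))
            w[neg] = np.minimum(0.0, w[neg] + (u - q[neg]))
            q += w - z                            # record what was applied
    return w
```

Clipping against the accumulated deficit, rather than subtracting the penalty at every step, is what lets weights reach exactly zero despite noisy stochastic gradients.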


Seismic impedance inversion using l1-norm regularization and gradient descent methods

We consider numerical solution methods for seismic impedance inversion problems in this paper. The inversion process is ill-posed. To tackle the ill-posedness of the problem and take the sparsity of the reflectivity function into consideration, an l1 norm regularization model is established. In computation, a nonmonotone gradient descent method based on Rayleigh quotient for solving the minimiz...
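The paper's nonmonotone, Rayleigh-quotient-based gradient method is not reproduced here; as a simpler stand-in solver for the same l1-regularized model, here is a proximal-gradient (ISTA) sketch:

```python
import numpy as np

def ista(A, b, lam=0.1, iters=500):
    """Proximal-gradient (ISTA) solver for the l1-regularized model
    min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2       # Lipschitz constant of A^T(Ax - b)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)           # gradient of the smooth term
        z = x - g / L
        # Soft-thresholding: the proximal operator of the l1 norm.
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x
```

The soft-thresholding step is what produces a sparse reflectivity estimate, matching the motivation stated in the abstract.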


A Light Touch for Heavily Constrained SGD

Projected stochastic gradient descent (SGD) is often the default choice for large-scale optimization in machine learning, but requires a projection after each update. For heavily-constrained objectives, we propose an efficient extension of SGD that stays close to the feasible region while only applying constraints probabilistically at each iteration. Theoretical analysis shows a good trade-off ...
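The paper's actual algorithm and guarantees are more involved; the sketch below only illustrates the basic flavor of projecting probabilistically, assuming caller-supplied grad and project callables (both hypothetical names):

```python
import numpy as np

def lightly_projected_sgd(grad, project, w0, eta=0.05, steps=1000,
                          p=0.1, rng=None):
    """Plain SGD steps, with the (possibly expensive) projection onto
    the feasible set applied only with probability p per iteration,
    plus once at the end so the returned point is feasible."""
    rng = rng or np.random.default_rng(0)
    w = w0.copy()
    for _ in range(steps):
        w -= eta * grad(w, rng)
        if rng.random() < p:          # only occasionally pay for a projection
            w = project(w)
    return project(w)

# Example: stochastic gradients of a quadratic, feasible set = unit L2 ball.
proj = lambda w: w / max(1.0, np.linalg.norm(w))
g = lambda w, rng: w - 1.0 + 0.1 * rng.standard_normal(w.shape)
w_hat = lightly_projected_sgd(g, proj, w0=np.zeros(5))
```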


Generalization Error Bounds for Aggregation by Mirror Descent with Averaging

We consider the problem of constructing an aggregated estimator from a finite class of base functions which approximately minimizes a convex risk functional under the l1 constraint. For this purpose, we propose a stochastic procedure, the mirror descent, which performs gradient descent in the dual space. The generated estimates are additionally averaged in a recursive fashion with specific weig...
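Since gradient descent in the dual space here amounts to entropic mirror descent on the probability simplex, a minimal sketch with recursive averaging might look as follows; the paper's specific averaging weights may differ, and uniform averaging plus the name grad are assumptions:

```python
import numpy as np

def entropic_md_averaging(grad, d, eta=0.1, steps=1000, rng=None):
    """Entropic mirror descent on the probability simplex with recursive
    uniform averaging of the iterates. grad(x, rng) should return a
    stochastic (sub)gradient of the risk at x."""
    rng = rng or np.random.default_rng(0)
    x = np.full(d, 1.0 / d)           # uniform start on the simplex
    x_bar = x.copy()
    for t in range(1, steps + 1):
        g = grad(x, rng)
        # Multiplicative step: a gradient step in the dual (entropy) geometry.
        x = x * np.exp(-eta * g)
        x /= x.sum()
        # Recursive averaging: x_bar is the running mean of all iterates.
        x_bar += (x - x_bar) / (t + 1)
    return x_bar
```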


Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks

Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore an important problem, and becomes a key factor when learning from very large data sets. This paper describes exponentiated gradient (EG) algorithms for training such models, whe...
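The heart of such EG training is a multiplicative update on per-example dual distributions. A bare sketch of that single step follows, with the dual objective itself (CRF or max-margin) and the feature model left abstract:

```python
import numpy as np

def eg_dual_step(alpha, grad, eta=0.5):
    """One exponentiated-gradient step on per-example dual distributions:
    alpha has shape (n_examples, n_outputs), each row a distribution over
    candidate outputs, and grad is the gradient of the dual objective in
    those variables."""
    alpha = alpha * np.exp(-eta * grad)              # multiplicative update
    return alpha / alpha.sum(axis=1, keepdims=True)  # renormalize each row
```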




Publication date: 2008